NSF PAR Search | NSF Public Access Repository

A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective

Shi, Lianghe; Wu, Meng; Zhang, Huijie; Zhang, Zekai; Tao, Molei; Qu, Qing (September 2025, NeurIPS)

Free, publicly-accessible full text available September 18, 2026
Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing

Chen, Siyi; Zhang, Huijie; Guo, Minzhe; Lu, Yifu; Wang, Peng; Qu, Qing (December 2024, Advances in Neural Information Processing Systems)

Recently, diffusion models have emerged as a powerful class of generative models. Despite their success, there is still limited understanding of their semantic spaces. This makes it challenging to achieve precise and disentangled image generation without additional training, especially in an unsupervised way. In this work, we improve the understanding of their semantic spaces through two intriguing observations: within a certain range of noise levels, (1) the learned posterior mean predictor (PMP) in the diffusion model is locally linear, and (2) the singular vectors of its Jacobian lie in low-dimensional semantic subspaces. We provide a solid theoretical basis to justify the linearity and low-rankness of the PMP. These insights allow us to propose an unsupervised, single-step, training-free LOw-rank COntrollable image editing (LOCO Edit) method for precise local editing in diffusion models. LOCO Edit identifies editing directions with desirable properties: homogeneity, transferability, composability, and linearity. These properties benefit greatly from the low-dimensional semantic subspace. Our method can further be extended to unsupervised or text-supervised editing in various text-to-image diffusion models (T-LOCO Edit). Finally, extensive empirical experiments demonstrate the effectiveness and efficiency of LOCO Edit.
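The two observations in this abstract (a locally linear PMP whose Jacobian is low-rank) can be illustrated on a toy problem. The sketch below is not the authors' LOCO Edit implementation: it assumes Gaussian data on an r-dimensional subspace and the noising model x_t = x0 + sigma * eps, for which the posterior mean is known in closed form. All function names and parameters are hypothetical.

```python
import numpy as np

def make_toy_pmp(d=16, r=3, sigma=0.5, seed=0):
    """Closed-form posterior mean E[x0 | x_t] for the assumed toy model
    x0 ~ N(0, U U^T) with orthonormal U (d x r), x_t = x0 + sigma * eps."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis of the data subspace
    shrink = 1.0 / (1.0 + sigma**2)

    def pmp(x_t):
        # Project onto the subspace, then shrink toward the prior mean (zero).
        return shrink * (U @ (U.T @ x_t))

    return pmp

def numerical_jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x."""
    d = x.size
    J = np.zeros((f(x).size, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        J[:, i] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

def top_edit_direction(f, x_t):
    """Top right singular vector of the Jacobian, plus all singular values."""
    J = numerical_jacobian(f, x_t)
    _, s, Vt = np.linalg.svd(J)
    return Vt[0], s

pmp = make_toy_pmp()
x_t = np.random.default_rng(1).standard_normal(16)
v, s = top_edit_direction(pmp, x_t)

# Count singular values that are not numerically zero.
numerical_rank = int(np.sum(s > 1e-6))

# A single-step "edit": move the noisy input along the top singular direction.
alpha = 2.0
x_edit = x_t + alpha * v
```

In this toy setting the Jacobian has exactly r non-negligible singular values, and the change in the PMP output scales linearly with `alpha`. A trained diffusion model would only show these properties approximately, and only within the range of noise levels the abstract refers to.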
Full Text Available
Exploring Low-Dimensional Subspace in Diffusion Models for Controllable Image Editing

Chen, Siyi; Zhang, Huijie; Guo, Minzhe; Lu, Yifu; Wang, Peng; Qu, Qing (December 2024, Advances in Neural Information Processing Systems)

Full Text Available
Exploring Low-Dimensional Subspace in Diffusion Models for Controllable Image Editing

Chen, Siyi; Zhang, Huijie; Guo, Minzhe; Lu, Yifu; Wang, Peng; Qu, Qing (November 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

Recently, diffusion models have emerged as a powerful class of generative models. Despite their success, there is still limited understanding of their semantic spaces. This makes it challenging to achieve precise and disentangled image generation without additional training, especially in an unsupervised way. In this work, we improve the understanding of their semantic spaces through two intriguing observations: within a certain range of noise levels, (1) the learned posterior mean predictor (PMP) in the diffusion model is locally linear, and (2) the singular vectors of its Jacobian lie in low-dimensional semantic subspaces. We provide a solid theoretical basis to justify the linearity and low-rankness of the PMP. These insights allow us to propose an unsupervised, single-step, training-free LOw-rank COntrollable image editing (LOCO Edit) method for precise local editing in diffusion models. LOCO Edit identifies editing directions with desirable properties: homogeneity, transferability, composability, and linearity. These properties benefit greatly from the low-dimensional semantic subspace. Our method can further be extended to unsupervised or text-supervised editing in various text-to-image diffusion models (T-LOCO Edit). Finally, extensive empirical experiments demonstrate the effectiveness and efficiency of LOCO Edit. The code and the arXiv version can be found on the project website.
Full Text Available
The Emergence of Reproducibility and Consistency in Diffusion Models

Zhang, Huijie; Zhou, Jinfan; Lu, Yifu; Guo, Minzhe; Shen, Liyue; Qu, Qing (June 2024, International Conference on Machine Learning)

Full Text Available
Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture

Zhang, Huijie; Lu, Yifu; Alkhouri, Ismail; Ravishankar, Saiprasad; Song, Dogyoon; Qu, Qing (June 2024, Conference on Computer Vision and Pattern Recognition)

Diffusion models, emerging as powerful deep generative tools, excel in various applications. They operate through a two-step process: introducing noise into training samples and then employing a model to convert random noise into new samples (e.g., images). However, their remarkable generative performance is hindered by slow training and sampling. This is due to the necessity of tracking extensive forward and reverse diffusion trajectories, and of employing a large model with numerous parameters across multiple timesteps (i.e., noise levels). To tackle these challenges, we present a multi-stage framework inspired by our empirical findings. These observations indicate the advantages of employing distinct parameters tailored to each timestep while retaining universal parameters shared across all timesteps. Our approach involves segmenting the time interval into multiple stages, where we employ a custom multi-decoder U-net architecture that blends time-dependent models with a universally shared encoder. Our framework enables the efficient distribution of computational resources and mitigates inter-stage interference, which substantially improves training efficiency. Extensive numerical experiments affirm the effectiveness of our framework, showcasing significant training and sampling efficiency enhancements on three state-of-the-art diffusion models, including large-scale latent diffusion models. Furthermore, our ablation studies illustrate the impact of two important components in our framework: (i) a novel timestep clustering algorithm for stage division, and (ii) an innovative multi-decoder U-net architecture, seamlessly integrating universal and customized hyperparameters.
Full Text Available
An Iterative Semi-Supervised Approach with Pixel-wise Contrastive Loss for Road Extraction in Aerial Images

https://doi.org/10.1145/3606374

Zhang, Huijie; Li, Pu; Liu, Xiaobai; Yang, Xianfeng; An, Li (July 2023, ACM Transactions on Multimedia Computing, Communications, and Applications)

Extracting roads in aerial images has numerous applications in artificial intelligence and multimedia computing, including traffic pattern analysis and parking space planning. Learning deep neural networks, though very successful, demands vast amounts of high-quality annotations, whose acquisition is time-consuming and expensive. To address this challenge, we propose a semi-supervised approach for image-based road extraction in which only a small set of labeled images is available for training. We design a pixel-wise contrastive loss that self-supervises network training and makes use of the large corpus of unlabeled images. The key idea is to identify pairs of overlapping image regions (positive) or non-overlapping image regions (negative) and encourage the network to produce similar outputs for positive pairs and dissimilar outputs for negative pairs. We also develop a negative sampling strategy to filter false negative samples during the process. An iterative procedure is introduced to apply the network to raw images to generate pseudo-labels, filter and select high-quality labels with the proposed contrastive loss, and re-train the network with the enlarged training dataset. We repeat these iterative steps until convergence. We validate the effectiveness of the proposed methods by performing extensive experiments on the public SpaceNet3 and DeepGlobe Road datasets. Results show that our proposed method achieves state-of-the-art results on public image segmentation benchmarks and significantly outperforms other semi-supervised methods.
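The idea of pulling positive pixel pairs together and pushing negative pairs apart can be sketched with a generic InfoNCE-style loss. This is a stand-in, not the loss defined in the paper: it assumes that row i of each embedding matrix describes the same ground location seen in two overlapping crops, and it treats every other pixel as a negative, without the paper's false-negative filtering. The function name and the temperature value are hypothetical.

```python
import numpy as np

def pixel_contrastive_loss(feat_a, feat_b, temperature=0.1):
    """InfoNCE-style loss over per-pixel embeddings of shape (N, D).

    Assumption: feat_a[i] and feat_b[i] form a positive pair; all other
    rows of feat_b serve as negatives for feat_a[i].
    """
    # L2-normalize so the dot product is cosine similarity.
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature                      # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
feats = rng.standard_normal((32, 8))

# Nearly identical embeddings for matching pixels: low loss expected.
aligned = pixel_contrastive_loss(feats, feats + 0.05 * rng.standard_normal((32, 8)))
# Reversed row order breaks the pixel correspondence: high loss expected.
shuffled = pixel_contrastive_loss(feats, feats[::-1])
```

The loss is small when matching pixels have similar embeddings and large when the correspondence is broken, which is the behavior the abstract describes for positive and negative pairs.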
Full Text Available
Global hidden spillover effects among concurrent green initiatives

https://doi.org/10.1016/j.scitotenv.2024.169880

An, Li; Liu, Jianguo; Zhang, Qi; Song, Conghe; Ezzine-de-Blas, Driss; Dai, Jie; Zhang, Huijie; Lewison, Rebecca; Bohnett, Eve; Stow, Douglas; et al (March 2024, Science of The Total Environment)

Concurrently implemented green initiatives to combat global environmental crises may be curtailed or even sacrificed given the ongoing global economic contraction. We collected empirical data and information about green initiatives from 15 sites or countries worldwide. We systematically explored how the specific policy, intended behaviors, and gains of a given green initiative may interact with those of other green initiatives concurrently implemented in the same geographic area or involving the same recipients. Surprisingly, we found that spillover effects were very divergent: one initiative could reduce the gain of another by 22% to 100%, representing alarming losses, while in other instances substantial co-benefits could arise, as one initiative could increase the gain of another by 9% to 310%. Leveraging these effects will help countries keep green initiatives with significant co-benefits but stop initiatives with substantial spillover losses in the face of widespread budget cuts, better meeting the United Nations’ sustainable development goals.
Full Text Available
Neighborhood impacts on household participation in payments for ecosystem services programs

Zhang, Huijie; Bilsborrow, Richard; Chun, Yongwan; Yang, Shuang; Dai, Jie (June 2021, Journal of Geographical Sciences)
Payments for Ecosystem Services (PES) programs have been implemented in both developing and developed countries to conserve ecosystems and the vital services they provide. These programs also often seek to maintain or improve the economic wellbeing of the populations living in the corresponding (usually rural) areas. Previous studies suggest that PES policy design, the presence or absence of concurrent PES programs, and a variety of socioeconomic and demographic factors can influence households' decisions to participate in a PES program. However, neighborhood impacts on household participation in PES have rarely been addressed. This study explores potential neighborhood effects on villagers’ enrollment in the Grain-to-Green Program (GTGP), one of the largest PES programs in the world, using data from China’s Fanjingshan National Nature Reserve. We utilize a fixed effects logistic regression model in combination with the eigenvector spatial filtering (ESF) method to explore whether neighborhood size affects household enrollment in GTGP. By comparing the results with and without ESF, we find that the ESF method can help account for spatial autocorrelation properly and reveal neighborhood impacts that are otherwise hidden, including the effects of the area of forest enrolled in a concurrent PES program, gender, and household size. The method can thus uncover mechanisms that previously went undetected because neighborhood impacts were not taken into account, and it provides an additional way to account for neighborhood impacts in studies of PES programs and beyond.
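The ESF step can be sketched with the standard construction: take the eigenvectors of the doubly centered spatial weights matrix M W M, where M = I - 11^T/n, and keep those with positive eigenvalues as extra covariates. This is a minimal sketch, not the study's code. The binary distance-band weights, the radius, and the random coordinates are assumptions, and the fixed effects logistic regression itself is not shown.

```python
import numpy as np

def moran_eigenvectors(coords, radius):
    """Eigenvectors of M W M with positive eigenvalues, sorted in descending order.

    W is an assumed binary distance-band weights matrix: two locations are
    neighbors if they lie within `radius` of each other (self excluded).
    """
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = ((dist <= radius) & (dist > 0)).astype(float)   # symmetric, zero diagonal
    M = np.eye(n) - np.ones((n, n)) / n                 # centering projector
    eigvals, eigvecs = np.linalg.eigh(M @ W @ M)        # symmetric eigenproblem
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-8                               # positive spatial autocorrelation only
    return eigvals[keep], eigvecs[:, keep]

# Hypothetical household locations (arbitrary units).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(40, 2))
vals, E = moran_eigenvectors(coords, radius=2.5)
```

Each column of `E` is a mutually orthogonal, zero-mean spatial pattern. In an ESF workflow, a selected subset of these columns would be appended to the design matrix so that the regression can absorb spatially autocorrelated variation.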
Full Text Available
